Abstract: We develop a rounding method based on random walks in polytopes, which leads to improved
approximation algorithms and integrality gaps for several assignment problems that arise in
resource allocation and scheduling. In particular, it generalizes the work of Shmoys & Tardos on the 
generalized assignment problem in two different directions, where the machines have hard capacities, 
and where some jobs can be dropped. We also outline possible applications and connections of this 
methodology to discrepancy theory and iterated rounding. 
1 Introduction

The "relax-and-round" paradigm is a well- 
known approach in combinatorial optimization. 
Given an instance of an optimization problem, we
enlarge the set of feasible solutions $\mathcal{I}$ to some set
$\mathcal{I}' \supseteq \mathcal{I}$, often the linear-programming (LP)
relaxation of the problem; we then map an (efficiently
computed, optimal) solution $x^* \in \mathcal{I}'$ to
some "nearby" $x \in \mathcal{I}$ and prove that $x$ is
near-optimal in $\mathcal{I}$. This second "rounding" step is
often a crucial ingredient, and many general techniques
have been developed for it. In this work, we
present a new rounding methodology which leads 
to several improved approximation algorithms in 
scheduling, and which, as we explain, appears to 
have connections and applications to other tech- 
niques and problems, respectively. 

We next present background on (randomized) 
rounding and a fundamental scheduling problem, 
before describing our contribution. 

Our work generalizes various dependent ran- 
domized rounding techniques that have been de- 
veloped over the past decade or so. Recall that 
in randomized rounding, we use randomization to
map $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ back to some
$x = (x_1, x_2, \ldots, x_n) \in \mathcal{I}$. Typically, we choose a
value $\alpha$ that is problem-specific, and, independently
for each $i$, define $x_i$ to be 1 with probability
$\alpha x_i^*$, and to be 0 with the complementary
probability of $1 - \alpha x_i^*$. Independence can, however,
lead to noticeable deviations from the mean
for random variables that are required to be very
close to (or even be equal to) their mean. A fruitful
idea developed in [?, 32, 45] is to carefully introduce
dependencies into the rounding process: in
particular, some sums of random variables are held
fixed with probability one, while still retaining randomness
in the individual variables and guaranteeing
certain types of negative-correlation properties
among them. See [?] for a related deterministic
approach that precedes these works. These
dependent-rounding approaches lead to numerous
improved approximation algorithms in scheduling
and packet-routing [?, 27, 32, 45].

We now introduce a fundamental scheduling 
model, which has spurred many advances and ap- 
plications in combinatorial optimization, including 
linear-, quadratic- & convex-programming relaxations
and new rounding approaches [?].
This model, scheduling
with unrelated parallel machines (UPM), and its
relatives play a key role in this work. Herein, we
are given a set $J$ of $n$ jobs, a set $M$ of $m$ machines,
and non-negative values $p_{i,j}$ ($i \in M$, $j \in J$):
each job $j$ has to be assigned to some machine,
and assigning it to machine $i$ will impose a processing
time of $p_{i,j}$ on machine $i$. (The word "unrelated"
arises from the fact that there may be no
pattern among the given numbers $p_{i,j}$.) Variants
such as the type of objective function(s) to be op- 
timized in such an assignment, whether there is an 
additional "cost-function", whether a few jobs can 
be dropped, and situations where there are release 
dates for, and precedence constraints among, the 
jobs, lead to a rich spectrum of problems and techniques.
We now briefly discuss two such highly
impactful results [34, 41]. The primary UPM objective
in these works is to minimize the makespan,
i.e., the maximum total load on any machine. It is
shown in [34] that this problem can be approximated
to within a factor of 2; furthermore, even
some natural special cases cannot be approximated
better than 1.5 unless P = NP [34]. Despite
much effort, these bounds have not been improved.
The work of [41] builds on the upper bound of
[34] to consider the generalized assignment problem
(GAP) where we incur a cost $c_{i,j}$ if we schedule
job $j$ on machine $i$; a simultaneous (2, 1)-approximation
for the (makespan, total cost)-pair
is developed in [41], leading to numerous applications
(see, e.g., [?]).



We generalize the methods of [?, 27, 31, 32, 45]
via a type of random walk toward a vertex of 
the underlying polytope that we outline next. We 
then present several applications in scheduling and 
bipartite matching through problem-specific spe- 
cializations of this approach, and discuss further 
prospects for this methodology. 

The rounding approaches of [?] are
generalized to linear systems as follows in [32].
Suppose we have an $n$-dimensional constraint system
$Ax \le b$ with the additional constraints that
$x \in [0,1]^n$. This will often be an LP-relaxation,
which we aim to round to some $y \in \{0,1\}^n$ such
that some constraints in "$Ay \le b$" hold with probability
one, while the rest are violated "a little"
(with high probability). Given some $x \in [0,1]^n$,
the rounding approach of [32] is as follows. First,
we assume without loss of generality that $x \in (0,1)^n$:
those $x_j$ that get rounded to 0 or 1 at
some point are held fixed from then on. Next,
we "judiciously" drop some of the constraints in
"$Ax \le b$" until the number of constraints becomes
smaller than $n$, thus making the system linearly
dependent; this leads to the efficient computation
of an $r \in \Re^n$ that is in the nullspace of this reduced
system. We then compute positive scalars $\alpha$
and $\beta$ such that $x_1 := x + \alpha r$ and $x_2 := x - \beta r$
both lie in $[0,1]^n$, and both have at least one component
lying in $\{0,1\}$; we then update $x$ to a random
$Y$ as: $Y := x_1$ with probability $\beta/(\alpha+\beta)$,
and $Y := x_2$ with the complementary probability
$\alpha/(\alpha+\beta)$. Thus we have rounded at least one
further component of $x$, and also have the useful
property that for all $j$, $\mathbf{E}[Y_j] = x_j$. Different ways
of conducting the "judicious" reduction lead to a
variety of improved scheduling algorithms in [32].
The setting of [27, 45] on bipartite $b$-matchings can
be interpreted in this framework.
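
The expectation property of this update can be checked in one line. Writing $\alpha, \beta > 0$ for the two step sizes and $r$ for the nullspace direction, as in the description above, the computation is:

\[
\mathbf{E}[Y] \;=\; \frac{\beta}{\alpha+\beta}\,(x + \alpha r) \;+\; \frac{\alpha}{\alpha+\beta}\,(x - \beta r)
\;=\; x + \frac{\beta\alpha - \alpha\beta}{\alpha+\beta}\, r \;=\; x .
\]

Moreover, if $A'$ denotes the matrix of the reduced system (a symbol introduced only for this remark), then $A' r = 0$ gives $A' Y = A' x$ for both outcomes, so every retained constraint keeps its value with probability one.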



We further generalize the above-sketched approach
of [32]. Suppose we are given a polytope
$V$ in $n$ dimensions, and a non-vertex point
$x$ belonging to $V$. An appropriate basic-feasible
solution will of course lead us to a vertex of $V$,
but we approach (not necessarily reach) a vertex
of $V$ by a random walk as follows. Let $C$ denote
the set of constraints defining $V$ which are
satisfied tightly (i.e., with equality) by $x$. Then,
note that there is a non-empty linear subspace $S$
of $\Re^n$ such that for any nonzero $r \in S$, we can
travel up to some strictly-positive distance $f(r)$
along $r$ starting from $x$, while staying in $V$ and
continuing to satisfy all constraints in $C$ tightly.
Our broad approach is to conduct a random move
$Y := x + R$ by choosing an appropriately random
$R$ from $S$, such that the property "$\mathbf{E}[Y_j] = x_j$"
of the previous paragraph still holds. In particular,
let RandMove($x$, $V$), or simply RandMove($x$)
if $V$ is understood, be as follows. Choose a
nonzero $r \in S$ arbitrarily, and set $Y := x + f(r) r$
with probability $f(-r)/(f(r)+f(-r))$, and
$Y := x - f(-r) r$ with the complementary probability
of $f(r)/(f(r)+f(-r))$. Note that if we repeat
RandMove, we obtain a random walk that finally
leads us to a vertex of $V$; the high-level idea is
to intersperse this walk with the idea of "judiciously
dropping some constraints" from the previous
paragraph, as well as combining certain constraints
together into one. Three major differences
from [32] are: (a) the care given to the tight constraints
$C$, (b) the choice of which constraint to
drop being based on $C$, and (c) clubbing some constraints
into one. As discussed next, this recipe appears
fruitful in a number of directions in scheduling,
and as a new rounding technique in general.
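
A minimal sketch of one RandMove step may help fix ideas. It assumes the polytope is given explicitly as $\{x : Ax \le b\}$ and that the caller supplies a direction $r$ along which every tight constraint stays tight; the names `max_step`, `rand_move_outcomes` and `rand_move`, and the toy polytope, are illustrative choices and not part of the paper.

```python
import random
from fractions import Fraction

def max_step(x, r, A, b):
    """Largest t >= 0 with A(x + t*r) <= b; plays the role of f(r).

    Returns None if no constraint ever blocks the move (this cannot
    happen in a bounded polytope for a nonzero direction r).
    """
    best = None
    for row, rhs in zip(A, b):
        slope = sum(a * ri for a, ri in zip(row, r))
        if slope > 0:  # only constraints that tighten along r can block
            slack = rhs - sum(a * xi for a, xi in zip(row, x))
            step = slack / slope
            best = step if best is None else min(best, step)
    return best

def rand_move_outcomes(x, r, A, b):
    """Return the two (probability, point) outcomes of one step."""
    neg_r = [-ri for ri in r]
    f_plus, f_minus = max_step(x, r, A, b), max_step(x, neg_r, A, b)
    y_plus = [xi + f_plus * ri for xi, ri in zip(x, r)]
    y_minus = [xi - f_minus * ri for xi, ri in zip(x, r)]
    total = f_plus + f_minus
    # Moving by f(r) gets probability f(-r)/(f(r)+f(-r)), and vice versa.
    return [(f_minus / total, y_plus), (f_plus / total, y_minus)]

def rand_move(x, r, A, b, rng=random):
    """Sample one of the two outcomes."""
    (p_plus, y_plus), (_, y_minus) = rand_move_outcomes(x, r, A, b)
    return y_plus if rng.random() < p_plus else y_minus

# Toy polytope (an assumption for illustration): 0 <= x1, x2 <= 1 and
# x1 + x2 = 1, the equality written as two inequalities.
A = [[1, 0], [0, 1], [-1, 0], [0, -1], [1, 1], [-1, -1]]
b = [1, 1, 0, 0, 1, -1]
x = [Fraction(1, 4), Fraction(3, 4)]
r = [1, -1]  # keeps the tight constraint x1 + x2 = 1 tight
outcomes = rand_move_outcomes(x, r, A, b)
```

In this toy instance both outcomes are vertices of the segment, each preserves $x_1 + x_2 = 1$, and the probability-weighted average of the two outcomes equals the starting point. Computing a direction in $S$ (for example by Gaussian elimination on the tight constraints) is left out of the sketch.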

Capacity constraints on machines, random matchings
with sharp tail bounds. Handling "hard capacities"
(those that cannot be violated) is generally
tricky in various settings, including facility-location
and other covering problems [19, 36].
Motivated by problems in crew-scheduling [22, 40]
and by the fact that servers have a limit on
how many jobs can be assigned to them, the natural
question of scheduling with a hard capacity-constraint
of "at most $b_i$ jobs to be scheduled on
each machine $i$" has been studied in [18, 48, 50-52].
Most recently, the work of [?] has shown
that this problem can be approximated to within
a factor of 3 in the special case where the machines
are identical (job $j$ has processing time $p_j$
on any machine). In §2 we use our random-walk
approach to generalize this to the setting of
GAP and obtain the GAP bounds of [41], i.e.,
approximation ratios of 2 and 1 for the makespan
and cost respectively, while satisfying the capacity
constraints: the improvements are in the more-general
scheduling model, adding the cost constraint,
and in the approximation ratio. We anticipate
that such a capacity-sensitive generalization
of [41] would lead to improved approximation
algorithms for several applications of GAP,
and present one such in Section 5.

Theorem 1 generalizes such capacitated problems
to random bipartite ($b$-)matchings with target
degree bounds and sharp tail bounds for given linear
functions; see [23] for applications to models
for complex networks. Recall that a ($b$)-matching
is a subgraph in which every vertex $v$ has degree at
most $b(v)$. Given a fractional ($b$)-matching $x$ in a
bipartite graph $G = (J, M, E)$ of $N$ vertices and
a collection of $k$ linear functions $\{f_i\}$ of $x$, many
works have considered the problem of constructing
($b$-)matchings $X$ such that $f_i(X)$ is "close"
to $f_i(x)$ simultaneously for each $i$ [3, 27, 28, 38].
The works [28, 38] focus on the case of constant
$k$; those of [3, 27] consider general $k$, and require
the usual "discrepancy" term of $\Omega(\sqrt{f_i(x) \log N})$
in $|f_i(X) - f_i(x)|$ for most/all $i$; in a few cases,
$o(N)$ vertices will have to remain unmatched also.
In contrast, Theorem 1 shows that if there is one
structured objective function $f_i$ with bounded coefficients
associated with each $i \in M$, then in fact
all the $|f_i(X) - f_i(x)|$ can be bounded independent
of $N$. This appears to be the first such result here,
and helps with equitable max-min fair allocations
as discussed below.

Scheduling with outliers: makespan and fairness.
Note that the (2, 1) bicriteria approximation that
we obtain for GAP above generalizes the results
of [41]. We now present such a generalization
in another direction: that of "outliers" in scheduling
[29]. For instance, suppose in the "processing
times $p_{i,j}$ and costs $c_{i,j}$" setting of GAP, we also
have a profit $\pi_j$ for choosing to schedule each job
$j$. Given a "hard" target profit $\Pi$, target makespan
$T$ and total cost $C$, the LP-rounding method of
[29] either proves that these targets are not simultaneously
achievable, or constructs a schedule with
values $(\Pi, 3T, C(1+\epsilon))$ for any constant $\epsilon > 0$.
Using our rounding approach, we improve this to
$(\Pi, (2+\epsilon)T, C(1+\epsilon))$ in §3. (The factors of $\epsilon$ in
the cost are required due to the hardness of knapsack
[?].) Also, fairness is a fundamental issue
in dealing with outliers: e.g., in repeated runs of
such algorithms, we may not desire long starvation
of individual job(s) in sacrifice to a global objective
function. Theorem 7 accommodates fairness
in the form of scheduling-probabilities for the jobs
that can be part of the input.

Max-Min Fair Allocation. This problem, also
known as the Santa Claus problem, is the max-min
version of UPM, where we aim to maximize
the minimum "load" (viewed as utility) on the machines;
it has received a good deal of attention recently
[?]. We are able to employ
dependent randomized rounding to near-optimally
determine the integrality gap of a well-studied LP
relaxation. Also, Theorem 1 lets us generalize a
result of [?] on max-min fairness to the setting of
equitable partitioning of the jobs; see §4.

Directions for the future: some potential connections
and applications. Distributions on structured
matchings in bipartite graphs is a topic that
models many scenarios in discrete optimization,
and we view our work as a useful contribution
to it. We explore further applications and connections
in §6. A general question involving
"rounding well" is the lattice approximation problem
[?]: given $A \in \{0,1\}^{m \times n}$ and $p \in [0,1]^n$,
we want a $q \in \{0,1\}^n$ such that $\|A \cdot (q - p)\|_\infty$
is "small"; the linear discrepancy of $A$ is defined
to be $\mathrm{lindisc}(A) = \max_{p \in [0,1]^n} \min_{q \in \{0,1\}^n} \|A \cdot (q - p)\|_\infty$.
The field of combinatorial discrepancy
theory [13] has developed several classical
results that bound $\mathrm{lindisc}(A)$ for various matrix
families $A$; column-sparse matrices have received
much attention in this regard. Section 6 discusses
a concrete approach to use our method for the
famous Beck-Fiala conjecture on the discrepancy
of column-sparse matrices [12], in the setting of
random matrices. §6 also suggests that there
may be deeper connections to iterated rounding,
a fruitful approach in approximation algorithms
[25, 30, 33, ?, 49]. We view our approach as
having broader connections/applications (e.g., to
open problems including capacitated facility location
[36]), and are studying these directions.

2 Random Matchings with Linear Constraints, and GAP with Capacity Constraints

We develop an efficient scheme to generate 
random subgraphs of bipartite graphs that satisfy 
hard degree-constraints and near-optimally satisfy 
a collection of linear constraints: 

Theorem 1 Let $G = (J, M, E)$ be a bipartite
graph with "jobs" $J$ and "machines" $M$. Let $T$
be the collection of edge-indexed vectors $y$ (with
$y_{i,j}$ denoting $y_e$ where $e = (i,j) \in E$). Suppose
we are given: (i) an integer requirement $r_j$ for each
$j \in J$ and an integer capacity $b_i$ for each $i \in M$;
(ii) for each $i \in M$, a linear objective function
$f_i : T \to \Re$ given by $f_i(y) = \sum_{j :\, (i,j) \in E} p_{i,j} y_{i,j}$
such that $0 \le p_{i,j} \le \ell_i$ for each $j$; and (iii) a vector
$x \in T$ with $x_e \in [0,1]$ for each $e$. Then, we
can efficiently construct a random subgraph of $G$
given by a binary vector $X \in T$, such that: (a)
with probability one, each $j \in J$ has degree at
least $r_j$, each $i \in M$ has degree at most $b_i$, and
$|f_i(X) - f_i(x)| < \ell_i$ $\forall i$; and (b) for all $e \in E$,
$\mathbf{E}[X_e] = x_e$.

We will now prove an important special case of
Theorem 1: GAP with individual capacity constraints
on each machine. This special case captures
much of the essence of Theorem 1; the full
proof of Theorem 1 is deferred to the final version
of this work. The capacity constraint specifies
the maximum number of jobs that can be scheduled
on any machine, and is a hard constraint.
Formally the problem is as follows, where $x_{i,j}$
is the indicator variable for job $j$ being scheduled
on machine $i$. Given $m$ machines and $n$
jobs, where job $j$ requires a processing time of
$p_{i,j}$ on machine $i$ and incurs a cost of $c_{i,j}$ if assigned
to $i$, the goal is to minimize the makespan
$T = \max_i \sum_j x_{i,j} p_{i,j}$, subject to the constraints
that the total cost $\sum_{i,j} x_{i,j} c_{i,j}$ is at most $C$ and,
for each machine $i$, $\sum_j x_{i,j} \le b_i$. Here $C$ is the given
upper bound on total cost and $b_i$ is the capacity of
machine $i$, which must be obeyed.

Our main contribution here is an efficient algo- 
rithm Sched-Cap that has the following guarantee, 
generalizing the GAP bounds of [41]:

Theorem 2 There is an efficient algorithm 
Sched-Cap that returns a schedule maintaining all 
the capacity constraints, of cost at most C and 
makespan at most 2T, where T is the optimal 
makespan with cost C that satisfies the capacity 
constraints. 

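We guess the optimum makespan $T$ by binary
search as in [34]. If $p_{i,j} > T$, $x_{i,j}$ is set to 0. The
solution to the following integer program gives the
optimum schedule:

\[
\begin{array}{rcll}
\sum_{i,j} c_{i,j}\, x_{i,j} & \le & C & \text{(Cost)} \\
\sum_{i} x_{i,j} & = & 1 \quad \forall j & \text{(Assign)} \\
\sum_{j} p_{i,j}\, x_{i,j} & \le & T \quad \forall i & \text{(Load)} \\
\sum_{j} x_{i,j} & \le & b_i \quad \forall i & \text{(Capacity)} \\
x_{i,j} & \in & \{0,1\} \quad \forall i, j & \\
x_{i,j} & = & 0 \quad \text{if } p_{i,j} > T &
\end{array}
\]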

We relax the constraint "$x_{i,j} \in \{0,1\}$ $\forall (i,j)$"
to "$x_{i,j} \in [0,1]$ $\forall (i,j)$" to obtain the LP relaxation
LP-Cap. We solve the LP to obtain the optimum
LP solution $x^*$; we next show how Sched-Cap
rounds $x^*$ to obtain an integral solution within
the approximation guarantee.
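
As a concrete reading of the constraints (Cost), (Assign), (Load) and (Capacity), the following sketch checks a candidate fractional assignment against them. The function name `check_lp_cap`, the matrix layout `[machine][job]`, the tolerance and the small instance are all assumptions made for illustration; no LP solver is involved.

```python
def check_lp_cap(x, p, c, b, T, C, tol=1e-9):
    """Report which LP-Cap constraints hold for a fractional assignment x.

    x, p, c are m-by-n lists indexed [machine][job]; b lists capacities.
    """
    m, n = len(p), len(p[0])
    M, J = range(m), range(n)
    return {
        "cost": sum(c[i][j] * x[i][j] for i in M for j in J) <= C + tol,
        "assign": all(abs(sum(x[i][j] for i in M) - 1) <= tol for j in J),
        "load": all(sum(p[i][j] * x[i][j] for j in J) <= T + tol for i in M),
        "capacity": all(sum(x[i][j] for j in J) <= b[i] + tol for i in M),
        "box": all(-tol <= x[i][j] <= 1 + tol for i in M for j in J),
        # Pairs with p[i][j] > T must carry no assignment at all.
        "pruned": all(x[i][j] <= tol for i in M for j in J if p[i][j] > T),
    }

# Hypothetical instance: 2 machines, 3 jobs, guessed makespan T = 4.
p = [[2, 2, 5], [3, 1, 2]]
c = [[1, 1, 1], [2, 2, 2]]
x = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
report = check_lp_cap(x, p, c, b=[2, 2], T=4, C=5)
tight_report = check_lp_cap(x, p, c, b=[1, 2], T=4, C=5)
```

Here `report` marks every constraint as satisfied, while lowering the capacity of the first machine to 1 makes only the capacity check fail, since that machine carries a fractional load of 1.5 jobs.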

Note that $x^*_{i,j} \in [0,1]$ denotes the "fraction" of
job $j$ assigned to machine $i$. Initialize $X = x^*$.
The algorithm is composed of several iterations.
The random value of the assignment-vector $X$ at
the end of iteration $h$ of the overall algorithm is denoted
by $X^h$. Each iteration $h$ conducts a randomized
update using RandMove on the polytope
of a linear system constructed from a subset of the
constraints of LP-Cap. Therefore, by induction on
$h$, we will have for all $(i, j, h)$ that $\mathbf{E}[X^h_{i,j}] = x^*_{i,j}$.

Let $J$ and $M$ denote the set of jobs and machines,
respectively. Suppose we are at the beginning
of some iteration $(h+1)$ of the overall algorithm:
we are currently looking at the values $X^h_{i,j}$.
We will maintain four invariants:

(I1) Once a variable $x_{i,j}$ gets assigned to 0 or 1, it
is never changed;

(I2) The constraints (Assign) always hold;

(I3) Once a constraint in (Capacity) becomes
tight, it remains tight; and

(I4) Once a constraint is dropped in some iteration,
it is never reinstated.

Iteration $(h+1)$ of Sched-Cap consists of three
main steps:

1. Since we aim to maintain (I1), let us remove
all $X^h_{i,j} \in \{0,1\}$; i.e., we project $X^h$ to those co-ordinates
$(i,j)$ for which $X^h_{i,j} \in (0,1)$, to obtain
the current vector $Y$ of "floating" (to-be-rounded)
variables; let $S = (A_h Y = u_h)$ denote the current
linear system that represents LP-Cap. ($A_h$ is some
matrix and $u_h$ is a vector; we avoid using "$S_h$" to
simplify notation.) In particular, the "capacity" of
machine $i$ in $S$ is its residual capacity $b'_i$, i.e., $b_i$ minus
the number of jobs that have been permanently
assigned to $i$ thus far.

2. Let $Y \in \Re^v$ for some $v$; note that $Y \in (0,1)^v$.
Let $M_k$ denote the set of all machines $i$ for which
exactly $k$ of the values $Y_{i,j}$ are positive. We will
now drop some of the constraints in $S$:

(D1) for each $i \in M_1$, we drop its load and capacity
constraints from $S$;

(D2) for each $i \in M_2$, we drop its load constraint
and rewrite its capacity constraint as
$x_{i,j_1} + x_{i,j_2} \le \lceil X^h_{i,j_1} + X^h_{i,j_2} \rceil$, where $j_1, j_2$ are the
two jobs fractionally assigned to $i$; and

(D3) for each $i \in M_3$ for which both its load
and capacity constraints are tight in $S$, we
drop its load constraint from $S$.

3. Let $V$ denote the polytope defined by this reduced
system of constraints. A key claim that is
proven in Lemma 3 below is that $Y$ is not a vertex
of $V$. We now invoke RandMove($Y$, $V$); this is
allowable if $Y$ is indeed not a vertex of $V$.

The above three steps complete iteration (h + 1). 
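
The dropping rules (D1)-(D3) of step 2 can be summarized per machine as a small decision function. The sketch below is only a restatement of those rules under assumed names (`constraints_kept`, `d2_capacity_rhs`); the treatment of a machine with no floating job is an added convention, since such a machine contributes no variables to $Y$.

```python
import math
from fractions import Fraction

def constraints_kept(degree, load_tight, cap_tight):
    """Which of a machine's constraints remain in S after (D1)-(D3).

    degree is the number of floating jobs on the machine; the two flags
    say whether its load and capacity constraints are currently tight.
    """
    if degree <= 1:
        # (D1); degree 0 is an added convention (no floating variables).
        return set()
    if degree == 2:
        # (D2): load dropped, capacity kept in its rewritten form.
        return {"capacity"}
    if degree == 3 and load_tight and cap_tight:
        # (D3): load dropped only when both constraints are tight.
        return {"capacity"}
    return {"load", "capacity"}

def d2_capacity_rhs(x1, x2):
    """Right-hand side of the rewritten (D2) capacity constraint."""
    return math.ceil(x1 + x2)
```

For example, a degree-2 machine whose two floating values are 1/4 and 1/2 keeps the constraint that at most `d2_capacity_rhs(Fraction(1, 4), Fraction(1, 2))` = 1 of the two jobs is finally assigned to it.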

It is not hard to verify that the invariants (I1)-(I4)
hold true (though the fact that we drop the
all-important capacity constraint for machines $i \in M_1$
may look bothersome, a moment's reflection
shows that such a machine cannot have a tight
capacity-constraint since its sole relevant job $j$ has
value $Y_{i,j} \in (0,1)$). Since we make at least one
further constraint tight via RandMove in each iteration,
invariant (I4) shows that we terminate, and
that the number of iterations is at most the initial
number of constraints. Let us next present
Lemma 3, a key lemma:

Lemma 3 In no iteration is $Y$ a vertex of the current
polytope $V$.

Proof. Suppose that in a particular iteration, $Y$
is a vertex of $V$. Fix the notation $v$, $M_k$ etc. w.r.t.
this iteration; let $m_k = |M_k|$, and let $n'$ denote
the remaining number of jobs that are yet to be assigned
permanently to a machine. Let us lower- and
upper-bound the number of variables $v$. On
the one hand, we have $v = \sum_{k \ge 1} k \cdot m_k$, by definition
of the sets $M_k$; since each remaining job $j$
contributes at least two variables (co-ordinates for
$Y$), we also have $v \ge 2n'$. Thus we get

\[ v \;\ge\; n' + \sum_{k \ge 1} (k/2) \cdot m_k. \qquad (1) \]

On the other hand, since $Y$ has been assumed to
be a vertex of $V$, the number $t$ of constraints in
$V$ that are satisfied tightly by $Y$ must be at least
$v$. How large can $t$ be? Each current job contributes
one (Assign) constraint to $t$; by our "dropping
constraints" steps (D1), (D2) and (D3) above,
the number of tight constraints ("load" and/or "capacity")
contributed by the machines is at most
$m_2 + m_3 + \sum_{k \ge 4} 2 m_k$. Thus we have

\[ v \;\le\; t \;\le\; n' + m_2 + m_3 + \sum_{k \ge 4} 2 m_k. \qquad (2) \]

Comparison of (1) and (2) and a moment's reflection
shows that such a situation is possible only
if: (i) $m_1 = m_3 = 0$ and $m_5 = m_6 = \cdots = 0$; (ii)
the capacity constraints are tight for all machines
in $M_2 \cup M_4$, i.e., for all machines; and (iii) $t = v$.
However, in such a situation, the $t$ constraints in $V$
constitute the tight assignment constraints for the
jobs and the tight capacity constraints for the machines,
and are hence linearly dependent (since the
total assignment "emanating from" the jobs must
equal the total assignment "arriving into" the machines).
Thus we reach a contradiction, and hence
$Y$ is not a vertex of $V$. □
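
The "moment's reflection" that compares the lower bound (1) with the upper bound (2) can be written out explicitly. Chaining the two bounds and cancelling $n'$, $m_2$ and $2m_4$ from both sides gives:

\[
n' + \sum_{k \ge 1} \frac{k}{2}\, m_k \;\le\; v \;\le\; t \;\le\; n' + m_2 + m_3 + \sum_{k \ge 4} 2 m_k
\quad\Longrightarrow\quad
\frac{m_1}{2} + \frac{m_3}{2} + \sum_{k \ge 5} \Big(\frac{k}{2} - 2\Big) m_k \;\le\; 0 .
\]

Every term on the left of the last inequality is non-negative, so each must vanish: $m_1 = m_3 = 0$ and $m_k = 0$ for all $k \ge 5$. With these values the two ends of the chain coincide, so every inequality in it holds with equality; in particular $t = v$, and the machines attain their maximum possible count of tight constraints.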

We next show that the final makespan is at most 
2T with probability one: 

Lemma 4 Let $X$ denote the final rounded vector.
Algorithm Sched-Cap returns a schedule,
where with probability one: (i) all capacity-constraints
on the machines are satisfied, and
(ii) for all $i$, $\sum_{j \in J} X_{i,j}\, p_{i,j} \le \sum_{j \in J} x^*_{i,j}\, p_{i,j} + \max_{j \in J :\, x^*_{i,j} \in (0,1)} p_{i,j}$.

Proof. Part (i) mostly follows from the fact that 
we never drop any capacity constraint; the only 
care to be taken is for machines $i$ that end up in $M_1$
and hence have their capacity-constraint dropped. 
However, as argued soon after the description of 
the three steps of an iteration, note that such a ma- 
chine cannot have a tight capacity-constraint when 
such a constraint was dropped; hence, even if the 
remaining job j got assigned finally to i, its capac- 
ity constraint cannot be violated. 

Let us now prove (ii). Fix a machine $i$. If at all
its load-constraint was dropped, it must be when $i$
ended up in $M_1$, $M_2$ or $M_3$. The case of $M_1$ is argued
as in the previous paragraph. So suppose $i \in M_\ell$
for some $\ell \in \{2, 3\}$ when its load constraint
got dropped. Let us first consider the case $\ell = 2$.
Let the two jobs fractionally assigned on $i$ at that
point have processing times $(p_1, p_2)$ and fractional
assignments $(y_1, y_2)$ on $i$, where $0 \le p_1, p_2 \le T$,
and $0 < y_1, y_2 < 1$. If $y_1 + y_2 \le 1$, we know that
at the end, the assignment vector $X$ will have at
most one of $X_1$ and $X_2$ being one. Simple algebra
now shows that $p_1 X_1 + p_2 X_2 \le p_1 y_1 + p_2 y_2 + \max\{p_1, p_2\}$
as required. If $1 < y_1 + y_2 < 2$,
then both $X_1$ and $X_2$ can be assigned and again,
$p_1 X_1 + p_2 X_2 \le p_1 y_1 + p_2 y_2 + \max\{p_1, p_2\}$. For
the case $\ell = 3$, we know from (I3) and (D3) that
its capacity-constraint must be tight at some integral
value $c$ at that point, and that this capacity-constraint
was preserved until the end. We must
have $c = 1$ or 2 here. Let us just consider the case
$c = 2$; the case of $c = 1$ is similar to the case
of $\ell = 2$ with $y_1 + y_2 \le 1$. Here again, simple
algebra yields that if $0 \le p_1, p_2, p_3 \le T$ and
$0 < y_1, y_2, y_3 < 1$ with $y_1 + y_2 + y_3 = c = 2$,
then for any binary vector $(X_1, X_2, X_3)$ of Hamming
weight $c = 2$, $p_1 X_1 + p_2 X_2 + p_3 X_3 \le p_1 y_1 + p_2 y_2 + p_3 y_3 + \max\{p_1, p_2, p_3\}$. □
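
The "simple algebra" steps in this proof can be sanity-checked by brute force on a grid. The sketch below normalizes $T = 1$ and enumerates processing times and fractional values on a grid of step $1/s$; the function names and the grid are illustrative assumptions, and a passing check is supporting evidence rather than a proof.

```python
from fractions import Fraction
from itertools import product

def bound_holds(p, y, X):
    """Check sum(p*X) <= sum(p*y) + max(p) for one configuration."""
    lhs = sum(pi * Xi for pi, Xi in zip(p, X))
    rhs = sum(pi * yi for pi, yi in zip(p, y)) + max(p)
    return lhs <= rhs

def check_cases(s=4):
    """Grid check of the two-job case and the three-job case with c = 2."""
    ps = [Fraction(k, s) for k in range(s + 1)]   # processing times in [0, 1]
    ys = [Fraction(k, s) for k in range(1, s)]    # fractional values in (0, 1)
    # Two floating jobs: the rewritten capacity allows ceil(y1 + y2) jobs.
    for p in product(ps, repeat=2):
        for y in product(ys, repeat=2):
            cap = 1 if sum(y) <= 1 else 2
            for X in product((0, 1), repeat=2):
                if sum(X) <= cap and not bound_holds(p, y, X):
                    return False
    # Three floating jobs with capacity tight at c = 2.
    for p in product(ps, repeat=3):
        for y in product(ys, repeat=3):
            if sum(y) != 2:
                continue
            for X in product((0, 1), repeat=3):
                if sum(X) == 2 and not bound_holds(p, y, X):
                    return False
    return True
```

The capacity condition matters: with $p = (1, 1)$ and $y = (1/4, 1/4)$, assigning both jobs would give load 2 against a bound of 3/2, which is exactly the outcome that the rewritten capacity constraint of (D2) excludes.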

Finally we have the following lemma. 

Lemma 5 Algorithm Sched-Cap can be deran- 
domized to create a schedule of cost at most C. 

Proof. Let $X^h_{i,j}$ denote the value of $X_{i,j}$ at iteration
$h$. We know for all $i, j, h$ that $\mathbf{E}[X^h_{i,j}] = x^*_{i,j}$,
where $x^*$ is the solution of LP-Cap. Therefore, at
the end, we have that the total expected cost incurred
is at most $C$. The procedure can be derandomized
directly by the method of conditional expectation,
giving a 1-approximation to cost. □
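
One natural way to instantiate the conditional-expectation argument, offered here as a sketch rather than as the authors' exact procedure, is to replace each random choice between the two RandMove endpoints by the endpoint of smaller cost. Since the current point is a convex combination of the two endpoints and the cost is linear, the cheaper endpoint never costs more than the current point. The names `linear_cost` and `derandomized_move` are hypothetical.

```python
from fractions import Fraction

def linear_cost(cost, y):
    """Total cost of a (fractional) assignment vector y."""
    return sum(c * v for c, v in zip(cost, y))

def derandomized_move(y_plus, y_minus, cost):
    """Deterministically pick the endpoint with the smaller linear cost."""
    if linear_cost(cost, y_plus) <= linear_cost(cost, y_minus):
        return y_plus
    return y_minus

# The current point is a convex combination of the two endpoints.
current = [Fraction(1, 4), Fraction(3, 4)]
chosen = derandomized_move([1, 0], [0, 1], cost=[3, 1])
```

In this toy instance the current point costs 3/2, the endpoints cost 3 and 1, and the rule picks the endpoint of cost 1. Both endpoints satisfy the same retained constraints, so the makespan and capacity arguments are unaffected by which one is chosen.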

Lemmas 4 and 5 yield Theorem 2.

3 Scheduling with Outliers 

In this section, we consider GAP with outliers
and with a hard profit constraint [29]. Formally,
the problem is as follows, where $x_{i,j}$ is the indicator
variable for job $j$ to be scheduled on machine
$i$. Given $m$ machines and $n$ jobs, where job $j$ requires
processing time of $p_{i,j}$ on machine $i$, incurs
a cost of $c_{i,j}$ if assigned to $i$ and provides a profit of
$\pi_j$ if scheduled, the goal is to minimize the makespan,
$T = \max_i \sum_j x_{i,j} p_{i,j}$, subject to the constraints
that the total cost $\sum_{i,j} x_{i,j} c_{i,j}$ is at most $C$
and the total profit $\sum_{i,j} \pi_j x_{i,j}$ is at least $\Pi$.

Our main contribution here is the following: 

Theorem 6 For any constant $\epsilon > 0$, there is an
efficient algorithm Sched-Outlier that returns a
schedule of profit at least $\Pi$, cost at most $C(1+\epsilon)$
and makespan at most $(2+\epsilon)T$, where $T$ is the
optimal makespan with cost $C$ and profit $\Pi$.

Note that this is an improvement over the work
of [29], which constructs a schedule with makespan
$3T$ with profit $\Pi$ and cost $C(1+\epsilon)$. In addition, our
approach also accommodates fairness, a basic requirement
in dealing with outliers, especially when
problems have to be run repeatedly. We formulate
fairness via stochastic programs that specify,
for each job $j$, a lower bound $r_j$ on the probability
that it gets scheduled. We adapt our approach to
honor such requirements:

Theorem 7 There is an efficient randomized algorithm
that returns a schedule of profit at least $\Pi$,
expected cost at most $2C$ and makespan at most
$3T$, and guarantees that each job $j$ is scheduled
with probability $r_j$, where $T$ is the optimal
expected makespan with expected cost $C$ and expected
profit $\Pi$. If the fairness guarantee on any
one job can be relaxed, then for every fixed $\epsilon > 0$,
there is an efficient algorithm to construct a schedule
that has profit at least $\Pi$, expected cost at most
$C(1 + 1/\epsilon)$ and makespan at most $(2+\epsilon)T$.

We defer the proof of Theorem 7 to the full
version, and focus on Theorem 6. Guess the optimum
makespan $T$ by binary search as in [34].
If $p_{i,j} > T$, $x_{i,j}$ is set to 0. We guess all assignments
$(i,j)$ where $c_{i,j} > \epsilon' C$, with $\epsilon' = \epsilon^2$.
Any valid schedule can schedule at most $1/\epsilon'$ pairs
with assignment costs higher than $\epsilon' C$; since $\epsilon'$
is a constant, this guessing can be done in polynomial
time. For all $(i,j)$ with $c_{i,j} > \epsilon' C$, let
$G_{i,j} \in \{0,1\}$ be a correct guessed assignment.
The solution to the following integer program then
gives an optimal solution:

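\[
\begin{array}{rcll}
\sum_{i,j} c_{i,j}\, x_{i,j} & \le & C & \text{(Cost)} \\
\sum_{i} x_{i,j} & = & y_j \quad \forall j & \text{(Assign)} \\
\sum_{j} p_{i,j}\, x_{i,j} & \le & T \quad \forall i & \text{(Load)} \\
\sum_{j} \pi_j\, y_j & \ge & \Pi & \text{(Profit)} \\
x_{i,j} & \in & \{0,1\} \quad \forall i, j; \qquad y_j \in \{0,1\} \quad \forall j & \\
x_{i,j} & = & 0 \quad \text{if } p_{i,j} > T & \\
x_{i,j} & = & G_{i,j} \quad \forall (i,j) \text{ such that } c_{i,j} > \epsilon' C &
\end{array}
\]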

We relax the constraint "$x_{i,j} \in \{0,1\}$ and $y_j \in \{0,1\}$"
to "$x_{i,j} \in [0,1]$ and $y_j \in [0,1]$" to obtain
the LP relaxation LP-Out. We solve the LP
to obtain an optimal LP solution $x^*, y^*$; we next
show how Sched-Outlier rounds $x^*, y^*$ to obtain
the claimed approximation.



Note that $x^*_{i,j} \in [0,1]$ denotes the fraction of job
$j$ assigned to machine $i$ in $x^*$. Initially, $\sum_i x^*_{i,j} = y^*_j$.
Initialize $X = x^*$. The algorithm is composed
of several iterations; the random values at
the end of iteration $h$ of the overall algorithm are
denoted by $X^h$. (Since $y_j$ is given by the equality
$\sum_i x_{i,j}$, $X^h$ is effectively the set of variables.)
Each iteration $h$ (except perhaps the last one) conducts
a randomized update using RandMove on
a suitable polytope constructed from a subset of
the constraints of LP-Out. Therefore, for all $h$
except perhaps the last, we have $\mathbf{E}[X^h_{i,j}] = x^*_{i,j}$.
A variable $X^h_{i,j}$ is said to be floating if it lies in
$(0,1)$, and a job is floating if it is not yet finally
assigned. The subgraph of $(J, M, E)$ composed of
the floating edges $(i,j)$ naturally suggests the following
notation at any point of time: machines of
"degree" $k$ in an iteration are those with exactly $k$
floating jobs assigned fractionally, and jobs of "degree"
$k$ are those assigned fractionally to exactly $k$
machines in iteration $h$. Note that since we allow
$y_j < 1$, there can exist singleton (i.e., degree-1)
jobs which are floating.

Suppose we are at the beginning of some iteration
$(h+1)$ of the overall algorithm; so we are currently
looking at the values $X^h_{i,j}$. We will maintain
the following invariants:

(I1') Once a variable $x_{i,j}$ gets assigned to 0 or 1,
it is never changed;

(I2') If $j$ is not a singleton, then $\sum_i x_{i,j}$ remains
at its initial value;

(I3') The constraint (Profit) always holds;

(I4') Once a constraint is dropped, it is never reinstated.

Algorithm Sched-Outlier starts by initializing 
with LP-Out. Iteration (h + 1) consists of four 
major steps. 

1. Since we aim to maintain (I1'), we remove all
$X^h_{i,j} \in \{0,1\}$; i.e., we project $X^h$ to those co-ordinates
$(i,j)$ for which $X^h_{i,j} \in (0,1)$, to obtain
the current vector $Z$ of "floating" variables; let
$S = (A_h Z = u_h)$ denote the current linear system
that represents LP-Out. ($A_h$ is some matrix and
$u_h$ is a vector.)

2. Let $Z \in \Re^v$ for some $v$; note that $Z \in (0,1)^v$.
Let $M_k$ and $N_k$ denote the set of degree-$k$ machines
and degree-$k$ jobs respectively, with $m_k = |M_k|$
and $n_k = |N_k|$. We will now drop/replace
some of the constraints in $S$:
(D1') for each $i \in M_1$, we drop its load constraint
from $S$;

(D2') for each $j \in N_1$, we drop its assignment
constraint from $S$; we include one profit constraint,
$\sum_{j \in N_1} \pi_j Z_{i,j} = \sum_{j \in N_1} \pi_j X^h_{i,j}$, that
replaces the constraint (Profit). (Note that
at this point, the values $X^h_{i,j}$ are some constants.)

Thus, the assignment constraints of the singleton
jobs are replaced by one profit constraint.

3. If $Z$ is a vertex of $S$ then define the fractional
assignment of a machine $i$ by $h_i = \sum_{j \in J} Z_{i,j}$.
Define a job $j$ to be tight if $\sum_{i \in M} Z_{i,j} = 1$.
Drop all assignment constraints of the non-tight
jobs and maintain a single profit constraint,
$\sum_{j \in N_1 \cup J_N} \pi_j Z_{i,j} = \sum_{j \in N_1 \cup J_N} \pi_j X^h_{i,j}$, where
$J_N$ are the nontight jobs. If $Z$ is not yet a vertex of
the modified $S$, then while there exists a machine
$i'$ whose degree $d$ satisfies $h_{i'} \ge (d - 1 - \epsilon)$, drop
the load constraint on machine $i'$.

4. Let $V$ denote the polytope defined by this reduced
system of constraints. If $Z$ is not a vertex
of $V$, invoke RandMove($Z$, $V$). Else, if there is
a degree-3 machine with 1 singleton job, assign
the singleton job and the cheaper (less processing
time) of the non-singleton jobs to it. If there exist
two singleton jobs, then discard the one with less
profit. Assign the remaining jobs in such a way
that each machine gets at most one extra job.

We now prove a key lemma, which shows when,
in step 4, $Z$ can possibly be a vertex of the polytope
under consideration. Recall that in the bipartite
graph $G = (J, M, E)$, we have in iteration
$(h+1)$ that $(i,j) \in E$ iff $X^h_{i,j} \in (0,1)$. Any
job or machine having degree 0 is not part of $G$.

Lemma 8 Let $m$ denote the number of machine-nodes
in $G$. If $m \ge 1/\epsilon$, then $Z$ is not a vertex of the
current polytope.

Proof. Let us consider the different possible 
configurations of G, when Z becomes a vertex of 
the polytope V at step 3. There are several cases 
to consider depending on the number of singleton 
floating jobs in G in that iteration. 

Case 1: There is no singleton job: We have $n_1 = 0$. Then, the number of constraints in S is $EQ = \sum_{k \ge 2} m_k + \sum_{k \ge 2} n_k$. Also the number of floating variables is $v = \sum_{k \ge 2} k\,n_k$. Alternatively, $v = \sum_{k \ge 1} k\,m_k$. Therefore, $v = \sum_{k \ge 2} \frac{k}{2}(m_k + n_k) + \frac{m_1}{2}$. Z being a vertex of V, $v \le EQ$.

Thus, we must have $n_k, m_k = 0$ for all $k \ge 3$ and $m_1 = 0$. Hence every floating machine has exactly two floating jobs assigned to it and every floating job is assigned to exactly two floating machines (Config-1 of Figure 1).

Case 2: There are at least 3 singleton jobs: We have $n_1 \ge 3$. Then the number of linear constraints is $EQ = \sum_{k \ge 2} m_k + \sum_{k \ge 2} n_k + 1$. The last "1" comes from considering one profit constraint for the singleton jobs. The number of variables is $v = \frac{n_1}{2} + \sum_{k \ge 2} \frac{k}{2}(m_k + n_k) + \frac{m_1}{2} \ge \frac{3}{2} + \sum_{k \ge 2} \frac{k}{2}(m_k + n_k) + \frac{m_1}{2}$. Since $\frac{k}{2} \ge 1$ for $k \ge 2$, this gives $v > EQ$. Hence the system is always underdetermined and Z cannot be a vertex of V.

Case 3: There are exactly 2 singleton jobs: We have $n_1 = 2$. Following similar counting arguments, we can show that each machine must have exactly two floating jobs assigned to it and each job except two is assigned to exactly two machines fractionally (Config-2 of Figure 1).

Case 4: There is exactly 1 singleton job: We have $n_1 = 1$. Then $EQ = \sum_{k \ge 2} m_k + \sum_{k \ge 2} n_k + 1$ and $v \ge \frac{1}{2} + n_2 + \frac{3}{2}n_3 + \frac{m_1}{2} + m_2 + \frac{3}{2}m_3 + \sum_{k \ge 4} \frac{k}{2}(m_k + n_k)$. If Z is a vertex of V, then $v \le EQ$. There are a few possible configurations that might arise in this case.

(i) Only one job of degree 3 and one job of de- 
gree 1. All the other jobs have degree 2 and all the 
machines have degree 2. We call this Config-3. 

(ii) Only one machine of degree 3 and one job of 
degree 1. The rest of the jobs and machines have 
degree 2. We call this Config-4. 

(iii) Only one machine of degree 1 and one job 
of degree 1. The rest of the jobs and machines have 
degree 2. We call this Config-5. 

These configurations are shown in Fig. 1. Each configuration can have an arbitrary number of disjoint cycles.
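The counting in Cases 1 to 4 can be checked mechanically. The following is a minimal sketch, not part of the algorithm itself: the function names and the degree-list input format are assumptions made for illustration. It counts one variable per edge, one load constraint per machine of degree at least 2, one assignment constraint per job of degree at least 2, and one profit constraint if any singleton job exists, and then tests the necessary condition $v \le EQ$ for Z to be a vertex.

```python
def count_variables_and_constraints(machine_degrees, job_degrees):
    """Return (v, EQ) for a floating bipartite graph given its degree lists."""
    if sum(machine_degrees) != sum(job_degrees):
        raise ValueError("machine and job degrees must count the same edges")
    v = sum(job_degrees)                                  # one floating variable per edge
    load = sum(1 for d in machine_degrees if d >= 2)      # degree-1 machines lose their load constraint
    assign = sum(1 for d in job_degrees if d >= 2)        # singleton jobs lose their assignment constraint
    profit = 1 if 1 in job_degrees else 0                 # single profit constraint for the singleton jobs
    return v, load + assign + profit


def passes_vertex_count(machine_degrees, job_degrees):
    """Necessary condition only: Z can be a vertex only if v <= EQ."""
    v, eq = count_variables_and_constraints(machine_degrees, job_degrees)
    return v <= eq


# Config-1: a single cycle in which every node has degree 2.
print(passes_vertex_count([2, 2, 2], [2, 2, 2]))
# Case 2: three singleton jobs force an underdetermined system.
print(passes_vertex_count([3, 2, 2], [1, 1, 1, 2, 2]))
```

The check is only a count of unknowns against equations; it does not test linear independence, so passing it does not prove that Z is a vertex.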

In any configuration, if there is a cycle with all tight jobs, then there always exists a machine with total fractional assignment 1 and hence its load constraint can always be dropped to make the system underdetermined. So we assume there is no such cycle in any configuration. Now suppose the algorithm reaches Config-1. If there are two non-tight jobs, then the system becomes underdetermined. Therefore, there can be at most one non-tight job and only one cycle (say C) with that non-tight job. Let C have m machines and thus m jobs. Therefore, $\sum_{(i,j) \in C} x_{i,j} \ge m - 1$. Thus there exists a machine such that the total fractional assignment of jobs on that machine is $\ge \frac{m-1}{m} = 1 - \frac{1}{m}$. If $m \ge 1/\epsilon$, then there exists a machine with total fractional assignment $\ge (1 - \epsilon)$. Dropping the load constraint on that machine makes the system underdetermined.

Figure 1: Different configurations of machine-job bipartite graph at Step 3 and 4

If the algorithm reaches Config-2, then all the jobs must be tight for Z to be a vertex. If there are m machines, then the number of non-singleton jobs is m - 1. If $x_{i_1,j_1} + x_{i_2,j_2} \ge 1$, then following a similar averaging argument as in Config-1, we can show that the machine with maximum fractional job assignment must have a total fractional assignment of at least 1. Otherwise, if $m \ge 1/\epsilon$, again the machine with maximum fractional job assignment must have a total fractional assignment of at least $1 - \epsilon$. For Config-3 and Config-5, if Z is a vertex of V, then all jobs must be tight and, using the same argument, there exists a machine with fractional assignment at least $(1 - \epsilon)$ if the algorithm reaches Config-3, and there exists a machine with fractional assignment 1 if the algorithm reaches Config-5.

If the algorithm reaches Config-4, then again all jobs must be tight. If the degree-3 machine has fractional assignment at least $2 - \epsilon$, then its load constraint can be dropped to make the system underdetermined. Otherwise, the total assignment to the degree-2 machines from all the jobs in the cycle is at least $m - 2 + \epsilon$. Therefore, there exists at least one degree-2 machine with fractional assignment at least $\frac{m-2+\epsilon}{m-1} = 1 - \frac{1-\epsilon}{m-1} \ge 1 - \epsilon$, if $m \ge 1/\epsilon$.

This completes the proof. □
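As a worked check of the two averaging steps used in the proof (Config-1 and Config-4), the threshold $m \ge 1/\epsilon$ can be derived directly. The numeric values $\epsilon = 0.1$ and $m = 10$ are chosen purely for illustration.

```latex
\[
\frac{m-1}{m} = 1 - \frac{1}{m} \;\ge\; 1 - \epsilon
\iff m \ge \frac{1}{\epsilon},
\qquad
\frac{m-2+\epsilon}{m-1} = 1 - \frac{1-\epsilon}{m-1} \;\ge\; 1 - \epsilon
\iff m - 1 \ge \frac{1-\epsilon}{\epsilon}
\iff m \ge \frac{1}{\epsilon}.
\]
\[
\text{For } \epsilon = 0.1,\ m = 10:\qquad
\frac{m-1}{m} = \frac{9}{10} = 0.9,
\qquad
\frac{m-2+\epsilon}{m-1} = \frac{8.1}{9} = 0.9 .
\]
```

Both bounds are tight exactly at $m = 1/\epsilon$, which is why the lemma is stated with that threshold.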



We next show that the final profit is at least $\Pi$ and the final makespan is at most $(2 + \epsilon)T$:

Lemma 9 Let X denote the final rounded vector. Algorithm Sched-Outlier returns a schedule, where with probability one, (i) the profit is at least $\Pi$, (ii) for all i, $\sum_{j \in J} X_{i,j}p_{i,j} \le \sum_{j} x^*_{i,j}p_{i,j} + (1 + \epsilon)\max_{j \in J:\, x^*_{i,j} \in (0,1)} p_{i,j}$.



Proof. (i) This essentially follows from the fact that whenever the assignment constraint on any job is dropped, its profit constraint is included in the global profit constraint of the system. At step 4, except for one case (Config-2), all the jobs are always assigned, so the profit cannot decrease in those cases. A singleton job (say $j_1$) is dropped only when G has two singleton jobs $j_1$ and $j_2$ fractionally assigned to $i_1$ and $i_2$ respectively, with total assignment $x_{i_1,j_1} + x_{i_2,j_2} < 1$. Otherwise the system remains underdetermined by Lemma 8. Since the job with higher profit is retained, $\pi_{j_1}x_{i_1,j_1} + \pi_{j_2}x_{i_2,j_2} \le \max\{\pi_{j_1}, \pi_{j_2}\}$.
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(ii) From Lemma [8] and (Dl'), load constraints 
are dropped from machines $i \in M_1$ and might be dropped from machines $i \in M_2 \cup M_3$. For $i \in M_1$, only the remaining job j with $x_{i,j} > 0$ can get fully assigned to it. Hence for $i \in M_1$, its total load is bounded by $\sum_j x^*_{i,j}p_{i,j} + \max_{j \in J:\, x^*_{i,j} \in (0,1)} p_{i,j}$. For any machine $i \in M_2 \cup M_3$, if its degree d (2 or 3) is such that its fractional assignment is at least $d - 1 - \epsilon$, then by simple algebra it can be shown that its total load is at most $\sum_j x^*_{i,j}p_{i,j} + (1 + \epsilon)\max_{j \in J:\, x^*_{i,j} \in (0,1)} p_{i,j}$. For the remaining machines consider what happens at step 4. Except when Config-4 is reached, any remaining machine i gets at most one extra job, and thus its total load is bounded by $\sum_j x^*_{i,j}p_{i,j} + \max_{j \in J:\, x^*_{i,j} \in (0,1)} p_{i,j}$. When Config-4 is reached at step 4, if the degree-3 machine has a fractional assignment of at most 1, then for any value of m there will exist a degree-2 machine whose fractional assignment is 1, giving a contradiction. Hence, let $j_1, j_2, j_3$ be the three jobs assigned fractionally to the degree-3 machine i, let $j_3$ be the singleton job, and $x_{i,j_1} + x_{i,j_2} \ge 1$. If $p_{i,j_1} \le p_{i,j_2}$, then the degree-3 machine gets $j_1, j_3$. Else the degree-3 machine gets $j_2, j_3$. The degree-3 machine gets 2 jobs, but its fractional assignment from $j_1$ and $j_2$ is already at least 1. Since the job with less processing time among $j_1$ and $j_2$ is assigned to i, its total load is at most $\sum_j x^*_{i,j}p_{i,j} + \max_{j \in J:\, x^*_{i,j} \in (0,1)} p_{i,j}$. □

Finally we have the following lemma.

Lemma 10 Algorithm Sched-Outlier can be derandomized to output a schedule of cost at most $C(1 + \epsilon)$.

Proof. In all iterations h, except the last one, for all i, j, $E[X^h_{i,j}] = x^*_{i,j}$, where $x^*_{i,j}$ is the solution of LP-Out. Therefore, before the last iteration, the total expected cost incurred is C. The procedure can be derandomized directly by the method of conditional expectation, giving a 1-approximation to cost just before the last iteration. Now at the last iteration, since at most $1/\epsilon$ jobs are assigned and each assignment requires at most $\epsilon' C = \epsilon^2 C$ in cost, the total increase in cost is at most $\epsilon C$, giving the required approximation. □

Lemmas 9 and 10 yield Theorem 6.



4 Max-Min Fair Allocation 

We now present our results for max-min fair al- 
location |4l|J,[^[l4j,[l5|]. There are m goods and k 
persons. Each person i has a non-negative integer valuation $u_{i,j}$ for good j. The valuation functions are linear, i.e., $u_{i,C} = \sum_{j \in C} u_{i,j}$ for any set C of goods. The goal is to allocate each good to a person such that the "least happy person is as happy as possible": i.e., $\min_i u_{i,C}$ is maximized. Our algorithm is based upon rounding the configuration LP, which is described in Subsection 4.1.
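To make the objective concrete, here is a minimal sketch that evaluates the max-min value of a given allocation and finds the optimum of a tiny instance by exhaustive search. The function names, the list-of-lists utility format, and the example numbers are assumptions made for illustration; the brute force is exponential and is not the rounding algorithm of this section.

```python
from itertools import product


def max_min_value(u, allocation):
    """u[i][j] is person i's utility for good j; allocation[j] is the person receiving good j."""
    totals = [0] * len(u)
    for good, person in enumerate(allocation):
        totals[person] += u[person][good]   # linear valuations: utilities simply add up
    return min(totals)                      # utility of the least happy person


def brute_force_optimum(u):
    """Try every assignment of goods to persons (k^m of them); only for tiny instances."""
    k, m = len(u), len(u[0])
    return max(max_min_value(u, alloc) for alloc in product(range(k), repeat=m))


u = [[3, 1, 2],
     [1, 4, 1]]
print(max_min_value(u, [0, 1, 0]))   # person 0 gets goods 0 and 2, person 1 gets good 1
print(brute_force_optimum(u))
```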



Theorem 1 1 Given any feasible solution to the 
configuration LP, it can be rounded to a feasible 
integer solution such that every person gets at least 

io \ ) f rac ti on of the optimum utility with 



log log k 

high probability in polynomial time. 



The 



approximation factor of 0(J log °| gfc ) 
is an improvement of the previous work of 
Jit], that achieved an approximation factor of 
$O(\sqrt{k}\log^3 k)$; our bound is near-optimal since the integrality gap of the configuration LP is $\Omega(\sqrt{k})$ [8]. However, note that the recent work of Chakrabarty, Chuzhoy and Khanna [20] has improved the bound to $m^{\epsilon}$. (Also note that $m \ge k$.)
Our main point is to show the applicability of our 
types of rounding approaches to this variation of 
the problem as well. 

In the context of fair allocation, an additional 
important criterion can be an equitable partition- 
ing of goods: we may impose an upper bound on 
the number of items a person might receive. For example, we may want each person to receive at most $\lceil m/k \rceil$ goods. Theorem 1 leads to the following:

Theorem 12 Suppose, in max-min allocation, we are given upper bounds $c_i$ on the number of items that each person i can receive, in addition to the utility values $u_{i,j}$. Let T be the optimum max-min allocation value that satisfies $c_i$ for all i. Then, we can efficiently construct an allocation in which for each person i the bound $c_i$ holds and she receives a total utility of at least $T - \max_j u_{i,j}$.



This generalizes the result of [14], which yields the "$T - \max_j u_{i,j}$" value when no bounds such as the $c_i$ are given. To our knowledge, the results
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of J3. El B 15] do not carry over to the setting of 
such "fairness bounds" $c_i$.

We now describe the algorithm and the proof of Theorem 11 in the next subsection. The major steps of the algorithm are similar to [5]; however, within each step the algorithm uses different rounding techniques, and hence our analysis is completely different. The rounding techniques are motivated by variants of the dependent rounding method.

4.1 Algorithm for Max-Min Fair Allocation 

We start by describing bipartite dependent rounding [27], which is a special case of RandMove that we have discussed so far.

4.1.1 Bipartite Dependent Rounding 

Suppose G = (U, V, E) is a bipartite graph with U and V being the vertices in the two partitions and E being the edges between them. The edges E are defined by the vector x: if $x_{i,j} \in (0,1)$, $i \in U$, $j \in V$, then $(i,j) \in E$ and vice-versa. The dependent rounding algorithm chooses an even cycle C or a maximal path P in G, and partitions the edges in C or P into two matchings $\mathcal{M}_1$ and $\mathcal{M}_2$. Then, two positive scalars $\alpha$ and $\beta$ are defined as follows:

$\alpha = \min\{\eta > 0 : ((\exists (i,j) \in \mathcal{M}_1 : x_{i,j} + \eta = 1) \vee (\exists (i,j) \in \mathcal{M}_2 : x_{i,j} - \eta = 0))\}$;

$\beta = \min\{\eta > 0 : ((\exists (i,j) \in \mathcal{M}_1 : x_{i,j} - \eta = 0) \vee (\exists (i,j) \in \mathcal{M}_2 : x_{i,j} + \eta = 1))\}$;



Now with probability $\frac{\beta}{\alpha+\beta}$, set

$Y_{i,j} = x_{i,j} + \alpha$ for all $(i,j) \in \mathcal{M}_1$
and $Y_{i,j} = x_{i,j} - \alpha$ for all $(i,j) \in \mathcal{M}_2$;

with complementary probability of $\frac{\alpha}{\alpha+\beta}$, set

$Y_{i,j} = x_{i,j} - \beta$ for all $(i,j) \in \mathcal{M}_1$
and $Y_{i,j} = x_{i,j} + \beta$ for all $(i,j) \in \mathcal{M}_2$;

The above rounding scheme satisfies the following two properties, which are easy to verify:

$\forall i,j,\ E[Y_{i,j}] = x_{i,j}$ (3)

$\exists i,j,\ Y_{i,j} \in \{0,1\}$ (4)

If X denotes the final rounded variable, then for any node $v \in U \cup V$,

$\lfloor \sum_{u \in U \cup V} x_{u,v} \rfloor \le \sum_{u \in U \cup V} X_{u,v} \le \lceil \sum_{u \in U \cup V} x_{u,v} \rceil$ (5)

4.1.2 Algorithm

Our algorithm is based upon rounding a configuration linear program similar to flUl]. We guess the optimum solution value T, using binary search. There is a variable $x_{i,C}$ for assigning a valid bundle C to person i. An item j is said to be small for person i if $u_{i,j} < T/\lambda$; otherwise it is said to be big. Here $\lambda$ is the approximation ratio, which will get fixed later. A configuration is a subset of items. A configuration C is called valid for person i if

• $u_{i,C} \ge T$ and all the items are small; or

• C contains only one item j and $u_{i,j} \ge T/\lambda$, that is, j is a big item for person i.

Let $\mathcal{C}(i,T)$ denote the set of all valid configurations corresponding to person i with respect to T. The configuration LP relaxation of the problem is as follows:

$\forall j:\ \sum_{C \ni j}\sum_{i} x_{i,C} \le 1$
$\forall i:\ \sum_{C \in \mathcal{C}(i,T)} x_{i,C} = 1$
$\forall i, C:\ x_{i,C} \ge 0$ (6)
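The definition of the valid configurations $\mathcal{C}(i,T)$ can be made concrete with a short sketch. The function name, the list input format, and the example values are assumptions made for illustration. The enumeration is exponential in the number of small items and is meant only for tiny instances; it says nothing about how the LP itself is solved.

```python
from itertools import combinations


def valid_configurations(u_i, T, lam):
    """Enumerate the valid configurations of one person.

    u_i[j] is the person's utility for item j, T the guessed optimum,
    lam the approximation ratio (lambda in the text).
    """
    threshold = T / lam
    small = [j for j in range(len(u_i)) if u_i[j] < threshold]
    big = [j for j in range(len(u_i)) if u_i[j] >= threshold]

    configs = [frozenset([j]) for j in big]          # a single big item is valid on its own
    for r in range(1, len(small) + 1):
        for subset in combinations(small, r):        # bundles made of small items only
            if sum(u_i[j] for j in subset) >= T:     # total utility must reach T
                configs.append(frozenset(subset))
    return configs


# Item 0 is big (6 >= 4/1.5); items 1, 2, 3 are small.
for config in valid_configurations([6, 1, 2, 2], T=4, lam=1.5):
    print(sorted(config))
```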

Using an argument similar to [3], one can show that if the above LP is feasible, then it is possible to find a fractional allocation that provides a bundle with value at least $(1 - \epsilon)T$ for each person in polynomial time.

We define a weighted bipartite graph G with the vertex set $A \cup B$ corresponding to the persons and the items respectively. There is an edge between a vertex corresponding to person $i \in A$ and item $j \in B$ if a configuration C containing j is fractionally assigned to i. Define

$w_{i,j} = \sum_{C \ni j} x_{i,C}$,

i.e., $w_{i,j}$ is the fraction of item j that is allocated to person i by the fractional solution of the LP. We know an item j is big for person i if $u_{i,j} \ge T/\lambda$. In this case, the edge (i, j) is called a matching edge. Otherwise it is called a flow edge.

Let M and F represent the set of matching and flow edges respectively. For each vertex $v \in A \cup B$, let $m_v$ be the total fraction of the matching edges incident to it. Also define $f_v = 1 - m_v$. The main steps of the algorithm are shown in Table 1.

Note that these steps are similar to those in [5], but the rounding techniques within each step are different. The rounding techniques are described next.

4.1.3 Finding a Random Matching

Consider the edges in M. Remove all the edges (i, j) that have already been rounded to 0 or 1. Additionally, if an edge is rounded to 1, remove both i and j (person i is satisfied as $u_{i,j} \ge T/\lambda$). We know an edge $(i,j) \in M$ implies that $u_{i,j} \ge T/\lambda$. Therefore, even if different items have different utility, a person matched to any one item in this stage has received at least $T/\lambda$ utility. We initialize, for each $(i,j) \in M$, $y_{i,j} = w_{i,j}$, and modify the $y_{i,j}$ values probabilistically in rounds using the approach described in Subsubsection 4.1.1. If $Y_{i,j}$ denotes the final rounded value, then $\forall (i,j) \in M$, by Property (3),

$\Pr[Y_{i,j} = 1] = w_{i,j}$.

This gives the following corollary.

Corollary 13 The probability that a vertex $v \in A \cup B$ is saturated in the matching generated by the algorithm is $m_v$.

Proof. Let the edges $e_1, e_2, .., e_l \in M$ be incident on v. Then, Pr[v is saturated] = Pr[$\exists e_i$, $i \in [1, l]$ s.t. v is matched with $e_i$] = $\sum_{i=1}^{l}$ Pr[v is matched with $e_i$] = $\sum_{i=1}^{l} w_{e_i} = m_v$. □

4.1.4 Allocating small bundles

Consider a person v who is not saturated in the matching: how much utility does this person v get? From all the bundles which are fractionally assigned to person v, remove any item j with $m_j \ge 1 - \epsilon_1$ ($\epsilon_1$ to be fixed later). Since the total sum of $m_j$ can be at most k (k = number of persons), there can be at most $\frac{k}{1-\epsilon_1}$ items in the bundles with $m_j \ge 1 - \epsilon_1$. Therefore the remaining items in the bundle have value $f_j \ge \epsilon_1$. Since bundles only contain small items, and the total valuation of each bundle (fractionally) is at least T, we have, after removing the items with $m_j \ge 1 - \epsilon_1$, that the remaining valuation is at least $\left(1 - \frac{\epsilon_1 k}{(1-\epsilon_1)\lambda}\right)T$. Define a random variable $Y_{v,j}$ for each remaining item such that,



$Y_{v,j} = \begin{cases} \frac{w'_{v,j}u_{v,j}}{T/\lambda} & \text{with probability } f_j \\ 0 & \text{otherwise} \end{cases}$ (7)

Here $w'_{v,j} = \frac{w_{v,j}}{f_v}$. Since each person v is not saturated by the matching with probability $1 - m_v = f_v$, each such person v selects bundle C with probability $x_{v,C}/f_v$. Thus each item j is selected with probability $w_{v,j}/f_v = w'_{v,j}$.

Define $G_v = \sum_j Y_{v,j}$. Then $\frac{T}{\lambda}G_v$ is the total fractional assignment to each person after step (3) and after doing further processing as suggested in the beginning of this subsubsection. We have,

$E[G_v] = \sum_j \frac{w'_{v,j}u_{v,j}f_j}{T/\lambda} \ge \epsilon_1\lambda\left(1 - \frac{\epsilon_1 k}{(1-\epsilon_1)\lambda}\right)$ (8)

Now we will show in Lemma 14 that the $Y_{v,j}$'s are negatively correlated. Therefore, applying the Chernoff-Hoeffding bound for negatively correlated random variables [37], we get

$\Pr[G_v \le (1 - \epsilon)E[G_v]] \le \exp(-E[G_v]\epsilon^2/3)$

for any $\epsilon \in (0, 1)$.
Or we have, 



< exp{-E[G v ]e 2 /3) 



Now substitute, ei = 
We have, 



log k 1 
log log k ^fk 



and A 



Pr[yG„<(l-e)^-, 



A 



'V.J 



fj] 



< 



< 



exp( 
exp( 



-E[C7„]e 2 /3) 
logfc 



0( 



2 log log k 
logfc 



2 /3) 
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Main Steps of the Algorithm for Max-Min Fair Allocation 

(1) Guess the value of the optimum solution T by doing a binary search. Solve LP (6).
Obtain the set M and $m_v$, $f_v$ for each vertex v in G.

(2) Select a random matching from edges in M using the algorithm described in Subsubsection 4.1.3
such that for every $v \in A \cup B$ the probability that v is saturated by the matching is $m_v = 1 - f_v$.

(3) Let every person i who is not matched yet select a bundle C of small goods with probability $x_{i,C}/f_i$
and claim the goods in that bundle, except those already assigned in the previous step.

(4) For each good j claimed by several persons in the previous step, resolve the contention by
following the algorithm described in Subsubsection 4.1.5.

Table 1: High level description of the algorithm for max-min fair allocation



Therefore in this step, the net fractional utility 
assigned to each person is > ^ fciSfc^-T* ' 
with probability > 1 — <d(\ogk/k). 

Now we prove that the $Y_{v,j}$'s are negatively correlated.

Lemma 14 The random variables $Y_{v,j}$, j = 1, 2, .., n, as defined in Equation (7), are negatively correlated.

Proof. Define an indicator random variable

$X_j = \begin{cases} 1 & \text{if } j \text{ is saturated in the matching step} \\ 0 & \text{otherwise} \end{cases}$

We will show that $\forall b \in \{0, 1\}$, for any subset of jobs S, $\Pr[\bigwedge_{j \in S}(X_j = b)] \le \prod_{j \in S}\Pr[X_j = b]$. This will imply that the $Y_{v,j}$'s are negatively correlated.

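Fix a subset of items J. Let b = 1 (the proof for b = 0 is identical). Consider iteration h. Let $H_j^h = \sum_i y_{i,j}^h$, where $y_{i,j}^h$ denotes the value of $y_{i,j}$ at the beginning of iteration h. We will show,

$\forall h,\ E[\prod_{j \in J} H_j^h] \le E[\prod_{j \in J} H_j^{h-1}]$ (9)

Thus we will have,

$\Pr[\bigwedge_{j \in J}(X_j = 1)] = E[\prod_{j \in J} H_j^{|E|+1}] \le E[\prod_{j \in J} H_j^1] = \prod_{j \in J}\sum_i y_{i,j}^1 = \prod_{j \in J}\Pr[X_j = 1]$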



Let us now prove (9) for a fixed h. In iteration h, exactly one of the following three cases occurs:

Case 1: All the jobs $j \in J$ are internal nodes of the maximal path. (If it is a cycle, all the nodes are internal.)

In this case, the values of the $H_j$'s, $j \in J$, do not change. Hence, $E[\prod_{j \in J} H_j^h \mid \text{Case 1}] \le E[\prod_{j \in J} H_j^{h-1} \mid \text{Case 1}]$.

Case 2: Exactly one job, say $j_1 \in J$, is the end point of the maximal path considered in iteration h, or has its value modified.

Let $B(j_1, \alpha, \beta)$ denote the event that the job $j_1$ has its value modified in the following probabilistic way:

$H_{j_1}^h = \begin{cases} H_{j_1}^{h-1} + \alpha & \text{with probability } \frac{\beta}{\alpha+\beta} \\ H_{j_1}^{h-1} - \beta & \text{with probability } \frac{\alpha}{\alpha+\beta} \end{cases}$

Thus,

$E[H_{j_1}^h \mid \forall j \in J,\ H_j^{h-1} = a_j \wedge B(j_1, \alpha, \beta)] = a_{j_1}$

Since the values of $H_j$, $j \ne j_1$, remain unchanged and the above equation holds for any $j_1, \alpha, \beta$, we have the desired result.

Case 3: Two jobs, say $j_1$ and $j_2$, are the end points of the maximal path considered in iteration h, or have their values modified.

See Event A of Lemma 2.2 of [27]. □
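The single rounding step that this proof analyzes (the path or cycle update of Subsubsection 4.1.1) can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary representation of x, and the three-edge example path are assumptions, and the two matchings are supplied by the caller rather than found in a graph. Instead of sampling, the function returns both outcomes with their probabilities, so that the expectation can be checked exactly.

```python
from fractions import Fraction


def step_outcomes(x, m1, m2):
    """One dependent-rounding step on a path or cycle split into matchings m1 and m2.

    x maps each edge to its current value in (0, 1).
    Returns [(probability, new_x), (probability, new_x)].
    """
    alpha = min([1 - x[e] for e in m1] + [x[e] for e in m2])   # largest push: m1 up, m2 down
    beta = min([x[e] for e in m1] + [1 - x[e] for e in m2])    # largest push: m1 down, m2 up
    up, down = dict(x), dict(x)
    for e in m1:
        up[e] += alpha
        down[e] -= beta
    for e in m2:
        up[e] -= alpha
        down[e] += beta
    return [(beta / (alpha + beta), up), (alpha / (alpha + beta), down)]


# Path p1 - g1 - p2 - g2 with edges e1 = (p1, g1), e2 = (g1, p2), e3 = (p2, g2).
x = {"e1": Fraction(1, 2), "e2": Fraction(1, 4), "e3": Fraction(2, 3)}
for prob, new_x in step_outcomes(x, ["e1", "e3"], ["e2"]):
    print(prob, new_x)
```

In this example the internal node g1 keeps its fractional degree $x_{e1} + x_{e2}$ in both outcomes (Case 1 of the proof), while the value at an end point changes but keeps its expectation (Case 2).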



4.1.5 Contention Resolution 

Consider the subgraph of the flow-graph in 
which an edge between a person and an item 
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remains if and only if the item is claimed by 
the person in the previous step. We showed 
each person has a net fractional utility of at least 



log fc 



f(1 



e(- 



-) in this sub- 



2 Y fcloglogfcv" ~ V ^/k log log fc ' 

graph. The weight on an edge between person v 
and item j in this subgraph is $w'_{v,j}$ and the utility of an item j to person v is $u_{v,j}$. Now we again do a kind of dependent rounding on this subgraph, where we additionally consider the utility of the items while modifying the assignment values on the edges. This is partly motivated by [46]. We remove all $(i,j)$ that have already been rounded to 0 or 1. Let F' be the current graph consisting of those $w'_{i,j}$ that lie in (0, 1). Choose any maximal path $P = (v_0, v_1, .., v_s)$ or a cycle $C = (v_0, v_1, .., v_s = v_0)$. The current w' value of an edge $e_t = (v_{t-1}, v_t)$ is denoted by $y_t$, that is, $y_t = w'_{t-1,t}$.



We will next choose the values $z_1, z_2, .., z_s$ either deterministically or probabilistically, depending on whether a cycle or a maximal path is chosen. We will update the w' value of each edge $e_t = (v_{t-1}, v_t)$ to $y_t + z_t$.

Suppose we have initialized some value for $z_1$ and that we have chosen the increments $z_1, z_2, \ldots, z_t$ for some $t \ge 1$. Then the value $z_{t+1}$ corresponding to the edge $e_{t+1} = (v_t, v_{t+1})$ is chosen as follows:

(PI) $v_t$ is an item. Then $z_{t+1} = -z_t$. (Each item is not assigned more than once.)

(PII) $v_t$ is a person. Then choose $z_{t+1}$ so that the utility of $v_t$ remains unchanged. Set $z_{t+1} = -z_t\,\frac{u_{v_t,v_{t-1}}}{u_{v_t,v_{t+1}}}$.

The vector $z = (z_1, z_2, .., z_s)$ is completely determined by $z_1$. We denote this by $f(z_1)$.

Now let $\mu$ be the smallest positive value such that if we set $z_1 = \mu$, then all the w' values (after incrementing by the vector z as specified above) stay in [0, 1], and at least one of them becomes 0 or 1. Similarly, let $\gamma$ be the smallest positive value such that if we set $z_1 = -\gamma$, then this rounding progress property holds. Now when we are considering a maximal path, we choose the vector z as follows:

(RPI) with probability $\frac{\gamma}{\mu+\gamma}$, let $z = f(\mu)$;
(RPII) with the complementary probability of $\frac{\mu}{\mu+\gamma}$, let $z = f(-\gamma)$.

Therefore in this case, if $Z = (Z_1, Z_2, .., Z_s)$ denotes the random vector z chosen in steps (RPI) and (RPII), the choice of probabilities in (RPI) and (RPII) ensures that $E[Z_1] = 0$, and since the rest are functions of $z_1$ alone, $E[Z_t] = 0$ for all t.
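The path case above can be sketched in a few lines. This is a hedged illustration: the function names, the tuple encoding of path nodes, and the example utilities are assumptions, and positive integer utilities are assumed so that exact rational arithmetic can be used. The first function applies rules (PI) and (PII) to obtain the increment vector as a multiple of $z_1$; the second finds the first value of $|z_1|$ at which some edge reaches 0 or 1, which gives $\mu$ (for sign +1) and $\gamma$ (for sign -1).

```python
from fractions import Fraction


def increments(z1, path, u):
    """Increment vector determined by z1 via rules (PI) and (PII).

    path is a list of ("person", name) / ("item", name) nodes; u[(person, item)] is a utility.
    """
    z = [z1]
    for t in range(1, len(path) - 1):
        kind, name = path[t]
        if kind == "item":
            z.append(-z[-1])                                   # (PI): item stays assigned at most once
        else:
            prev_item, next_item = path[t - 1][1], path[t + 1][1]
            ratio = Fraction(u[(name, prev_item)], u[(name, next_item)])
            z.append(-z[-1] * ratio)                           # (PII): person's utility is unchanged
    return z


def first_hit(y, coeffs, sign):
    """Smallest positive step s such that y_t + sign * s * coeffs[t] first reaches 0 or 1."""
    bounds = []
    for y_t, c_t in zip(y, coeffs):
        slope = sign * c_t
        if slope > 0:
            bounds.append((1 - y_t) / slope)
        elif slope < 0:
            bounds.append(y_t / -slope)
    return min(bounds)


path = [("person", "A"), ("item", 1), ("person", "B"), ("item", 2)]
u = {("B", 1): 2, ("B", 2): 4}
y = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 3)]

coeffs = increments(Fraction(1), path, u)
mu = first_hit(y, coeffs, +1)
gamma = first_hit(y, coeffs, -1)
expected_z1 = gamma / (mu + gamma) * mu + mu / (mu + gamma) * (-gamma)
print(coeffs, mu, gamma, expected_z1)
```

Because every $z_t$ is a fixed multiple of $z_1$, the zero expectation of $Z_1$ under (RPI) and (RPII) carries over to all $Z_t$.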

Now when we are considering a cycle, assume $v_0$ is a person. The assignment of the $z_i$ values ensures that all the objects in the cycle are assigned exactly once and that the utility of all the persons except $v_0$ remains unaffected. Now the value of $z_s$ is $-z_1\prod_{t=1}^{s/2-1}\frac{u_{v_{2t},v_{2t-1}}}{u_{v_{2t},v_{2t+1}}}$. If $\frac{u_{v_0,v_{s-1}}}{u_{v_0,v_1}}\prod_{t=1}^{s/2-1}\frac{u_{v_{2t},v_{2t-1}}}{u_{v_{2t},v_{2t+1}}} > 1$, we set $z_1 = -\gamma$, else we set $z_1 = \mu$. Therefore the utility of the person $v_0$ can only increase.

Let $Y_v^i$ denote the utility assigned to person v (fractional and integral) at the end of iteration i. The value $Y_v^0$ refers to the initial utilities in the flow-graph. Property (PII) and the deterministic rounding scheme while considering a cycle ensure that as long as a person has degree 2 in the flow-graph, $Y_v^i \ge Y_v^0$ with probability 1. In particular, if v never has degree 1, then its final utility is the same as its initial utility in the flow graph. Suppose the degree of person v becomes 1 at some iteration i and let j be its unique neighbor. Let $\beta = u_{v,j}$ and suppose, at the end of iteration i, the total already rounded utility on person v and the value of $w'_{v,j}$ are $\alpha \ge 0$ and $p \in (0, 1)$ respectively. Note that j, $\alpha$, $\beta$, p are all random variables and that $Y_v^i = \alpha + \beta p$; so,

$\Pr[\alpha + \beta p \ge Y_v^0] = 1$

Fix any j, $\alpha$, $\beta$, p such that $\alpha + \beta p \ge Y_v^0$. Induction on the iterations shows that the final utility of v is $\alpha$ with probability $(1 - p)$ and $\alpha + \beta$ with probability p. Thus the expected utility is $\alpha + \beta p$,

which is same as the initial utility of -j J fc 

In this process, there are some deterministic rounding steps interleaved with randomized rounding. We can ignore the deterministic rounding steps, since they always increase the utility. Define $X_{v,j} = 1$ if $w'_{v,j} > 0$ and j was given to v, else define it to be 0. We can prove, similarly to Lemma 14, that the variables $X_{v,j}$ are negatively correlated. Now since the utility of each item is at most $T/\lambda$, using the Chernoff-Hoeffding bounds for negative correlation [37], we get that the net utility is
concentrated around its expected value with prob- 
er r 

ability > 1 — exp{— 2 



Therefore we get Theorem 11.



i- Kg Egj \ 1 log fc 
T/A I > 1 fc ' 
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5 Designing Overlay Multicast Net- 
works For Streaming 

The work of [2] studies approximation algorithms for designing a multicast overlay network. We first describe the problem and state the results in [2] (Lemma 15 and Lemma 16). Next, we show our main improvement in Lemma 17.

The background text here is largely borrowed 
from [2]. An overlay network can be represented 
as a tripartite digraph N = (V, E). The nodes
V are partitioned into sets of entry points called 
sources (S), reflectors (R), and edge-servers or 
sinks (D). There are multiple commodities or 
streams, that must be routed from sources, via re- 
flectors, to the sinks that are designated to serve 
that stream to end-users. Without loss of gen- 
erality, we can assume that each source holds a 
single stream. Now given a set of streams and 
their respective edge-server destinations, a cheap- 
est possible overlay network must be constructed 
subject to certain capacity, quality, and reliabil- 
ity requirements. There is a cost associated with 
usage of every link and reflector. There are ca- 
pacity constraints, especially on the reflectors, that 
dictate the maximum total bandwidth (in bits/sec) 
that the reflector is allowed to send. The quality of 
a stream is directly related to whether or not an
edge-server is able to reconstruct the stream with- 
out significant loss of accuracy. Therefore even 
though there is some loss threshold associated with 
each stream, at each edge-server only a maximum 
possible reconstruction-loss is allowed. To ensure 
reliability, multiple copies of each stream may be 
sent to the designated edge-servers. 

All these requirements can be captured by an integer program. Let us use indicator variable $z_i$ for building reflector i, $y_{i,k}$ for delivery of the k-th stream to the i-th reflector and $x_{i,j,k}$ for delivering the k-th stream to the j-th sink through the i-th reflector. $F_i$ denotes the fanout constraint for each reflector $i \in R$. Let $p_{x,y}$ denote the failure probability on any edge (source-reflector or reflector-sink). We transform the probabilities into weights: $w_{i,j,k} = -\log(p_{k,i} + p_{i,j} - p_{k,i}p_{i,j})$. Therefore, $w_{i,j,k}$ is the negative log of the probability of a commodity k failing to reach sink j via reflector i. On the other hand, if $\phi_{j,k}$ is the minimum required success probability for commodity k to reach sink j, we instead use $W_{j,k} = -\log(1 - \phi_{j,k})$. Thus $W_{j,k}$ denotes the negative log of the maximum allowed failure. $r_i$ is the cost for opening the reflector i and $c_{x,y,k}$ is the cost for using the link (x, y) to send commodity k. Thus we have the IP (see Table 2).

Constraints (10) and (11) are natural consistency requirements; constraint (12) encodes the fanout restriction. Constraint (13), the weight constraint, ensures quality and reliability. Constraint (14) is the standard integrality-constraint that will be relaxed to construct the LP relaxation.
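The log transformation turns a product of failure probabilities into a sum of weights, which is what makes the weight constraint (13) linear. The sketch below illustrates this under the assumption, made here for illustration, that the paths through different reflectors fail independently; the function names and the numeric probabilities are also assumptions.

```python
import math


def edge_weight(p_source_reflector, p_reflector_sink):
    """Weight of one source-reflector-sink path: negative log of its failure probability."""
    fail = p_source_reflector + p_reflector_sink - p_source_reflector * p_reflector_sink
    return -math.log(fail)


def demand_weight(phi):
    """Negative log of the maximum allowed failure probability 1 - phi."""
    return -math.log(1.0 - phi)


def meets_weight_constraint(paths, phi):
    """paths lists (p_source_reflector, p_reflector_sink) for the reflectors serving one sink.

    Summed weights >= demand weight is equivalent to
    (product of the path failure probabilities) <= 1 - phi.
    """
    return sum(edge_weight(p1, p2) for p1, p2 in paths) >= demand_weight(phi)


# Each path fails with probability 0.19; two paths together fail with probability 0.0361.
print(meets_weight_constraint([(0.1, 0.1)], phi=0.95))
print(meets_weight_constraint([(0.1, 0.1), (0.1, 0.1)], phi=0.95))
```

With a required success probability of 0.95, one such path is not enough but two are, which matches the role of sending multiple copies of a stream for reliability.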

There is an important stability requirement that is referred to as the color constraint in [2]. Reflectors are grouped into m color classes, $R = R_1 \cup R_2 \cup \ldots \cup R_m$. We want each group of reflectors to deliver not more than one copy of a stream into a sink. This constraint translates to

$\sum_{i \in R_l} x_{i,j,k} \le 1 \quad \forall j \in D, \forall k \in S, \forall l \in [m]$ (15)

Each group of reflectors can be thought to belong to the same ISP. Thus we want to make sure that a client is served only with one - the best - stream possible from a certain ISP. This diversifies the stream distribution over different ISPs and provides stability. If an ISP goes down, most of the sinks will still be served. We refer to the LP-relaxation of the integer program (Table 2) with the color constraint (15) as LP-Color.
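For an integral solution, the color constraint (15) is easy to check directly. The following is a small sketch; the function name, the set-of-triples encoding of the $x_{i,j,k}$ variables, and the example reflector names are assumptions made for illustration.

```python
from collections import Counter


def color_violations(x, color_of):
    """Return the (color, sink, stream) triples that receive more than one copy.

    x is the set of (reflector, sink, stream) triples with x_{i,j,k} = 1;
    color_of maps each reflector to its color class.
    """
    counts = Counter((color_of[i], j, k) for (i, j, k) in x)
    return {key: c for key, c in counts.items() if c > 1}


color_of = {"r1": "ispA", "r2": "ispA", "r3": "ispB"}
x = {("r1", "d1", "s1"), ("r2", "d1", "s1"), ("r3", "d1", "s1")}
print(color_violations(x, color_of))   # r1 and r2 share a color and both serve (d1, s1)
```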

All of the above is from [2].

The work of [2] uses a two-step rounding procedure and obtains the following guarantee.

First stage rounding: Rounds $z_i$ and $y_{i,k}$ for all i
and k to decide which reflector should be open and 
which streams should be sent to a reflector. The 
results from rounding stage 1 can be summarized 
in the following lemma: 

Lemma 15 ([2]) The first-stage rounding algorithm incurs a cost at most a factor of $64 \log |D|$
higher than the optimum cost, and with high prob- 
ability violates the weight constraints by at most a 
factor of j and the fanout constraints by at most a 
factor of 2. Color constraints are all satisfied. 

By incurring a factor of $\Theta(\log n)$ in the cost, the constant-factor losses in the weights and fanouts can be improved.

$\min\ \sum_{i \in R} r_i z_i + \sum_{i \in R}\sum_{k \in S} c_{k,i,k}\,y_{i,k} + \sum_{i \in R}\sum_{k \in S}\sum_{j \in D} c_{i,j,k}\,x_{i,j,k}$

s.t.
$y_{i,k} \le z_i \quad \forall i \in R, \forall k \in S$ (10)
$x_{i,j,k} \le y_{i,k} \quad \forall i \in R, \forall j \in D, \forall k \in S$ (11)
$\sum_{k \in S}\sum_{j \in D} x_{i,j,k} \le F_i \quad \forall i \in R$ (12)
$\sum_{i \in R} x_{i,j,k}\,w_{i,j,k} \ge W_{j,k} \quad \forall j \in D, \forall k \in S$ (13)
$x_{i,j,k} \in \{0, 1\},\ y_{i,k} \in \{0, 1\},\ z_i \in \{0, 1\}$ (14)

Table 2: Integer Program for Overlay Multicast Network Design

Second stage rounding: Rounds the $x_{i,j,k}$'s using the open reflectors and streams that are sent to different reflectors in the first stage. The results in this stage can be summarized as follows:

Lemma 16 ([2]) The second-stage rounding incurs a cost at most a factor of 14 higher than the
optimum cost and violates each of fanout, color 
and weight constraint by at most a factor of 7. 

Our main contribution is an improvement of the 
second-stage rounding through the use of repeated 
RandMove and by judicious choices of constraints 
to drop. Let us call the linear program that remains 
just at the end of first stage LP-Color2. More pre- 
cisely, we show: 

Lemma 17 LP-Color2 can be efficiently 
rounded such that cost and weight constraints are 
satisfied exactly, fanout constraints are violated 
at most by additive 1 and color constraints are 
violated at most by additive 3. 

We defer the proof of the above lemma to the 
full version. 

6 Future Directions 

We discuss two speculative directions related to 
our rounding approach that appear promising. 

Recall the notions of discrepancy and linear discrepancy from the introduction. A well-known result here, due to [12], is that if A is "t-bounded" (every column has at most t nonzeroes), then
lindisc(A) < t; see lf3~lTl for a closely-related re- 
sult. These results have also helped in the develop- 
ment of improved rounding-based approximation 



algorithms Jgl [47)]. A major open question from 
[12] is whether $\mathrm{lindisc}(A) \le O(\sqrt{t})$ for any t-
bounded matrix A; this, if true, would be best- 
possible. Ingenious melding of randomized round- 
ing, entropy-based arguments and the pigeonhole 
principle have helped show that lindisc(A) < 
$O(\sqrt{t}\log n)$ fill J35I 0], improved further to $O(\sqrt{t \log n})$ in [7]. However, the number of
columns n may not be bounded as a function of t, 
and it would be very interesting to even get some 
o(t) bound on lindisc(A), to start with. We have 
preliminary ideas about using the random-walks 
approach where the subspace S (that is orthogo- 
nal to the set of tight constraints C in our random- 
walks approach) has "large" - O(n) - dimension. 
In a little more detail, whereas the constraints for 
rows i of A are dropped in [12] when there are 
at most t to-be-rounded variables corresponding 
to the nonzero entries of row i, we propose to do 
this dropping at some function such as $c_0 t$ to-be-rounded variables, for a large-enough constant $c_0$
(instead of at t). This approach seems promising as 
a first step, at least for various models of random 
t-bounded matrices. 

Second, there appears to be a deeper connection between various forms of dependent randomized rounding - such as ours - and iterated rounding lE HH S HI SI]. In particular: (i) the result that we improve upon in § 2 is based on iterated rounding [18]; (ii) certain "budgeted" assignment problems that arise in keyword auctions give the same results under iterated rounding [16] and weighted dependent rounding [46]; and (iii) our ongoing work suggests that our random-walk approach improves upon the iterated-rounding-based work of [28] on bipartite matchings that are simultaneously "good" w.r.t. multiple linear objectives (this is related to, but not implied by, Theorem 1). We believe it would be very fruitful to understand possible deeper links between these two rounding approaches, and to develop common generalizations thereof using such insight.
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